Skip to content

Conversation

@scott-huberty
Copy link
Member

@scott-huberty scott-huberty commented Dec 24, 2024

RFC (Request For comments)

I am opening this as a PR instead of an issue just so that there is a workable example of what I am proposing, to facilitate discussion.

Background

When using the flag_channels_by_fixed_threshold method in the pipeline class, users are encouraged to specify a threshold that is appropriate for their data. However, finding the right threshold often involves a process of trial and error, and I am increasingly finding this to be tedious with the current API.

To streamline this process, I propose adding a standalone function that takes an epochs object and returns a list of "bad" channels based on a given threshold. This would allow users to iterate more quickly in determining an optimal threshold. Then, the method flag_channels_fixed_threshold just wraps this function, so that we aren't duplicating code.

Example

I.e. instead of doing this:

import mne
import pylossless as ll

fname = mne.datasets.sample.data_path() + '/MEG/sample/sample_audvis-ave.fif'
raw = mne.io.read_raw_fif(fname)

config = ll.Config().load_default()
config["flag_channels_fixed_threshold"] = {"threshold": 5e-5}
pipeline = ll.Pipeline(config)
pipeline.raw = raw
pipeline.flag_channels_by_fixed_threshold(epochs)
pipeline.flags["ch"].items()

You would like to be able to do

import pylossless as ll
import mne

fname = mne.datasets.sample.data_path() + '/MEG/sample/sample_audvis-ave.fif'
raw = mne.io.read_raw_fif(fname)
epochs = mne.make_fixed_length_epochs(raw)
for threshold in [1e-5, 5e-5, 1e-4]:
    print(ll.find_bad_channels_by_threshold(epochs, threshold))

If we find this design to be helpful, it could eventually implement it for the other pipeline steps. But for now, flag_channels_fixed_threshold isn't even a part of the default pipeline (we only implemented it in order to replicate the MATLAB pipeline for the paper), and so I think that it makes for a good guinea pig for this design (since few if any users probably use it).

@christian-oreilly WDYT?

@codecov
Copy link

codecov bot commented Dec 24, 2024

Codecov Report

Attention: Patch coverage is 95.65217% with 1 line in your changes missing coverage. Please review.

Project coverage is 84.49%. Comparing base (c985968) to head (8e5eb40).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
pylossless/pipeline.py 95.45% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #214      +/-   ##
==========================================
+ Coverage   83.43%   84.49%   +1.06%     
==========================================
  Files          26       26              
  Lines        1696     1716      +20     
==========================================
+ Hits         1415     1450      +35     
+ Misses        281      266      -15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@christian-oreilly
Copy link
Member

Please note that it is not flag_channels_by_fixed_threshold but flag_channels_fixed_threshold in the code.

Seems reasonable to me.

As is often necessary with processing steps using thresholds, we will probably want to implement at some point a function that estimate the optimal threshold based on the properties of the data, rather than using a fixed and arbitrary value. That is more or less what you are saying you are doing with a trial-and-error process for now.

@scott-huberty
Copy link
Member Author

As is often necessary with processing steps using thresholds, we will probably want to implement at some point a function that estimate the optimal threshold based on the properties of the data, rather than using a fixed and arbitrary value.

That would be awesome.

OK, I'll clean this up and get the CI's green, and then request a review from you.

@scott-huberty
Copy link
Member Author

Sorry for letting this sit so long.

It ended up being more LOC than I anticipated but a lot of that is docstring. Added a test and coverage is slightly increased, if that helps you feel assured that everything I've added here is tested.

@christian-oreilly ready for your review!

Copy link
Member

@christian-oreilly christian-oreilly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor (comment/documentation-related) suggestions. One more substantial about avoiding to hardcode the 5e-5 default value at two places.

@scott-huberty scott-huberty merged commit 9052a63 into lina-usc:main Mar 18, 2025
8 checks passed
@scott-huberty
Copy link
Member Author

Thanks @christian-oreilly !!!

@scott-huberty scott-huberty deleted the find_bads_by_threshold branch March 18, 2025 15:11
Andesha pushed a commit to Andesha/pylossless that referenced this pull request Mar 31, 2025
* RFC, WIP: public function to find and return bad channels

* TST, DOC: Add test, clean it up

* TST: add flag_channels_fixed_threshold step to full pipeline run test

* Christian Suggestions (More DRY) [skip actions] [ci skip]

Co-authored-by: Christian O'Reilly <[email protected]>

* FIX: line length...

---------

Co-authored-by: Christian O'Reilly <[email protected]>
Andesha pushed a commit to Andesha/pylossless that referenced this pull request Apr 1, 2025
* RFC, WIP: public function to find and return bad channels

* TST, DOC: Add test, clean it up

* TST: add flag_channels_fixed_threshold step to full pipeline run test

* Christian Suggestions (More DRY) [skip actions] [ci skip]

Co-authored-by: Christian O'Reilly <[email protected]>

* FIX: line length...

---------

Co-authored-by: Christian O'Reilly <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants